Listener Background in L2 Speech Evaluation
Listeners are an integral part of second language (L2) oral performance assessment. However, listeners' evaluations are susceptible to listener background variables and biases. These variables and preexisting biases distort native speaker (NS) listeners' perceptions of non-native speakers' (NNSs') speech performance and introduce error into oral performance assessment. Among listener background variables, listeners' first language status, amount of exposure to different English varieties, educational background, prior language teaching experience, linguistic stereotyping of NNSs, and listener attitude have been investigated in the literature and are assumed to account for a sizable amount of variation in speakers' oral proficiency true scores. To minimize listener bias in assessment contexts, listeners are given intensive training in rating NNSs' speech more objectively using scoring rubrics. To mitigate listener bias in social contexts, the literature provides evidence in favor of structured intergroup contact programs, interventions designed to improve NSs' attitudes and thereby make them more receptive to NNSs' English varieties. To enhance L2 listeners' self-efficacy and foster their autonomy, L2 instructors are encouraged to emphasize explicit instruction of listening strategies.
The effects of situational contexts and occupational roles on listeners' judgements on accented speech
Much language attitude research has demonstrated that people make biased judgements based on speakers' language choice and accent. However, the influence of occupational context on listeners' perceptions of accented English is largely unknown. This verbal guise study examined the extent to which academic contexts and workforce-related professional contexts affect listeners' judgements of accented speech. Results revealed that simulated contexts made a significant difference in listeners' perceptual judgements, with speakers perceived as significantly more comprehensible and acceptable in service-occupational roles than in academic contexts. These findings suggest that listeners' speech judgements can be heavily influenced by speakers' situational contexts. The study also provides evidence in support of the fluency principle, showing that listeners may evaluate accented speech more negatively if it requires more processing effort. The findings inform the domains of linguistic stereotyping and listeners' attitudes towards accented speech.
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment
Current leading mispronunciation detection and diagnosis (MDD) systems
achieve promising performance via end-to-end phoneme recognition. One challenge
of such end-to-end solutions is the scarcity of human-annotated phonemes on
natural L2 speech. In this work, we leverage unlabeled L2 speech via a
pseudo-labeling (PL) procedure and extend the fine-tuning approach based on
pre-trained self-supervised learning (SSL) models. Specifically, we use Wav2vec
2.0 as our SSL model, and fine-tune it using original labeled L2 speech samples
plus the created pseudo-labeled L2 speech samples. Our pseudo labels are
dynamic and are produced by an ensemble of the online model on-the-fly, which
ensures that our model is robust to pseudo label noise. We show that
fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction
and 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning
baseline. The proposed PL method is also shown to outperform conventional
offline PL methods. Compared to the state-of-the-art MDD systems, our MDD
solution produces a more accurate and consistent phonetic error diagnosis. In
addition, we conduct an open test on a separate UTD-4Accents dataset, where our
system recognition outputs show a strong correlation with human perception,
based on accentedness and intelligibility.
Comment: Accepted to Interspeech 202
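
The abstract above describes pseudo labels that are produced on the fly by an ensemble of the online model. One common way to realize such an ensemble is an exponential moving average (EMA) of the online model's weights; whether the paper uses exactly this rule is an assumption here. The minimal Python sketch below illustrates the idea on toy data. The function names `ema_update` and `pseudo_label`, the momentum value, and the dictionary-based "weights" and "scores" are all hypothetical and stand in for a real Wav2vec 2.0 model.

```python
# Illustrative sketch only: the update rule, names, and hyperparameters
# are assumptions, not details taken from the abstract.

def ema_update(teacher, online, momentum=0.999):
    """Move each teacher weight toward the matching online weight (EMA)."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * online[name]
        for name in teacher
    }


def pseudo_label(frame_scores):
    """Pick the highest-scoring phoneme for each frame (greedy decoding)."""
    return [max(scores, key=scores.get) for scores in frame_scores]


# Toy "weights" for the momentum (teacher) model and the online model.
teacher = {"w": 0.0, "b": 1.0}
online = {"w": 1.0, "b": 0.0}

# After each online training step, the teacher drifts toward the online model.
teacher = ema_update(teacher, online, momentum=0.9)

# Toy per-frame phoneme scores, as the teacher might emit for unlabeled speech.
labels = pseudo_label([{"AH": 0.7, "AE": 0.3}, {"T": 0.2, "D": 0.8}])
```

In a setup like this, the slowly changing teacher would label the unlabeled L2 speech while the online model trains on both human and pseudo labels. Averaging over many training steps is one plausible reason such labels could be less sensitive to noise than labels from a single checkpoint.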